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Abstract 

We consider the problem of an ensemble Kalman filter when only partial obser- 
vations are available. In particular we consider the situation where the observational 
space consists of variables which are directly observable with known observational error, 
and of variables of which only their climatic variance and mean are given. To limit the 
variance of the latter poorly resolved variables we derive a variance limiting Kalman 
filter (VLKF) in a variational setting. We analyze the variance limiting Kalman filter 
for a simple linear toy model and determine its range of optimal performance. We 
explore the variance limiting Kalman filter in an ensemble transform setting for the 
Lorenz-96 system, and show that incorporating the information of the variance of some 
un-observable variables can improve the skill and also increase the stability of the data 
assimilation procedure. 

1 Introduction 

In data assimilation one seeks to find the best estimation of the state of a dynamical system 
given a forecast model w i th a p ossible model error and noisy observations at discrete obser- 
vation intervals ( Kalnayl . 120021 ) . This process is complicated on the one hand by the often 



chaotic nature of the underlying nonlinear dynamics leading to an increase of the variance of 
the forecast, and on the other hand by the fact that one often has only partial information 
of the observables. In this paper we address the latter issue. We consider situations whereby 
noisy observations are available for some variables but not for other unresolved variables. 
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However, for the latter we assume that some prior knowledge about their statistical climatic 
behaviour such as their variance and their mean is available. 



A particularl y attract i ve fra mework for data assimilation are ensemble Kalman filters 
(see for example lEvensenl (120061 )). These straightforwardly implemented filters distinguish 
themselves from other Kalman filters in that the spatially and temporally varying back- 
ground error covariance is estimated from an ensemble of nonlinear forecasts. Despite the 
ease of implementation and the flow- dependent estimation of the error covariance en s emble 
Kalman filters are subject to several errors and specific difficulties (see lEhrendorferl (120071 ) 



for a recent review). Besides the problems of estimating model error which is inherent to all 
filters, and inconsistencies between the filter assumptions and reality such as non-Gaussianity 
which render all Kalman filters suboptimal, ensemble based Kalman filters have the specific 
problem of sampling errors due to an insufficient size of the ensemble. These errors usually 
underestimate the error covariances which may ultimately lead to filter divergence when the 
filter trusts its own forecast and ignores the information given by the observations. 



To counteract the associated small spread of the ensemble several techniques have been 
developed. To deal with errors in ensemble filters due to sampling errors we mention two 
of the main algorithms, covariance inflation and localisation. To avoid filter divergence 
due to an underestimation of error covariances the concept of covariance inflation was in- 
troduced whereby the prior fo recast error covariance is increased by an inflation factor 
( lAnderson and Anderson! . Il999l ) . This is usually done in a global fashion and involves careful 
and expensive tuning of the inflation factor; however recently method s have been devised to 



adaptive l y esti mate the inflation factor from the innovation statistics ( lAndersonl . 120071 . 12009 



Li et all . 120091 ). Too small ensemble sizes also lead to spurious correlations associated with 



remote observations. To address this i ssue the concept of localization has been introduced 



(Ho utekamer and Mitchell! Il998l . 1200 ll : lHamill et all l200ll : lOtt et all 12004 ; ISzunvogh et al 



20051 ) whereby only spatially close observations are used for the innovations. 



To take into account the u ncertainty in the model representation we mention here iso tropic 
model error parametri zation ([Mitchell and Houtekamer! . l2000l ; lHoutekamer et a/.l.l2005r). stochas- 
tic parametrizations ( jBuizza et all 119991 ) and kinet ic energy backscat t er (IShuttsl. 120051). A 



recent comparis on between those methods is given in lHoutekamer et al\ ( 120091 ) ; ICharron et al. 



(l2010h . see a lso|HamiU_and Whitakerl (l2005h. T he pro blem of non-Gaussianity is for example 
discussed in Pires et al. (l2010h : lBocquet et all dioioh . 



Whereas the underestimation of error covariances has received much attention, relatively 
little is done for a possible overestimation of error covariances. Overestimation of covariance 
is a finit e ensemble size e f fect which typically oc curs in sparse observation networks (see for 
example iLiu et all (120081 ) ; IWhitaker et al\ ( 120091 ) ) . Uncontrolled growth of error covariances 
which is not tempered by available observations may progressively spoil the overall analysis. 
This effect is even exacerbated when inflation is used; in regions where no observations influ- 
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ence the analysis, inflation can lead to unrealis tically large ensemble v ariances progressively 
degrading the overall analysis (see for example IWhitaker et al\ (120041 ) ). This is particularly 
problematic when inappropriate uniform inflation is used. Moreover, it is well known that 
covarian ce localization can be a significan t source of imb l ance in the analyzed field s (see for 
example iHoutekamer and Mitchell! ( 120051 ); Kepertl (120091 ) ; iHoutekamer et all (120091 )). Local- 
ization artificially generates unwanted gravity wave activity which in poorly resolved spatial 
regions may lead to an unrealistic overestimation of error covariances. Being able to control 
this should help filter performances considerably. 



When assimilating current weather data in numerical schemes for the troposphere, the 
main problem is underestimation of error covariances rather than overestimation. This is due 
to the availability of radiosonde data which assures wide observational coverage. However, in 
the pre-radiosonde era there were severe data voids, particularly in the southern hemisphere 
and in vertical resolution since most observations were done on the surface level in the north- 
ern hemisphere. There is an increased in t erest in so called climate reanalysis (see for example 



( Bengtsson et ai , 2007 ; Whitaker et ai , 20041 )). which has the challenge to deal with large 



unobserved regions. Historical atmospheric observations are reanalyzed by a fixed forecast 
scheme to provide a global homogeneous dataset covering troposphere and stratosphere for 
very long period s. A remarkable eff ort is the international Twentieth Century Reanalysis 
Project (20CR) ( Compo et all 2011 ). which produced a global estimate of the atmosphere 
for the entire 20th century (1871 to the present) using only synoptic surface pressure reports 
and monthly sea-surface temperature and sea-ice distributions. Such a dataset could help 
to analyze climate variations in the twentieth century or the multidecadal variations in the 
behaviour of the El-Nino-Southern Oscillation. An obstacle for rean alysis is the overestima- 



tion of error covariances if one chooses to employ ensemble filters (IWhitaker et al 
where multiplicative covariance inflation is employed). 



(120041 ) 



Overestimation of error covariances occurs also in modern numerical weather forecast 
schemes for which the upper lid of the vertical domain is constantly pushed towards higher 
and higher levels to incorporate the mesos phere, with the aim to b e tter resolve p r ocesse s 
in the polar stra t osphe re (see for example iPolavarapu et all (120051 ); ISankey et all (120071 ); 
Eckermann et al\ (120091 )). The energy spectrum in the mesosphere is, contrary to the tropo- 
sphere, dominated by gravity waves. The high variability associated with these waves causes 
very large error covar iances in the mesospher e which can be 2 orders of magnitude larger 
than at lower levels (IPolavarapu et all 120051 ). rendering the filter very sensitive to small 
uncertainties in the forecast covariances. Being able to control the variances of mesospheric 
gravity waves is therefore a big challenge. 



The question we address in this work is how can the statistical information available for 
some data which are otherwise not observable, be effectively incorporated in data assimi- 
lation to control the potentially high error covariances associated with the data void. We 
will develop a framework to modify the familiar Kalman filter (see for example (jEvensenl . 
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20061 : ISimonl . l2006j )) for partial observations with only limited information on the mean and 



variance, with the effect that the error covariance of the unresolved variables cannot exceed 
their climatic variance and their mean is controlled by driving it towards the climatological 
value. 

The paper is organized as follows. In Section [2] we will introduce the dynamical setting 
and briefly describe the ensemble transform Kalman filter (ETKF), a special form of an 
ensemble square root filter. In Section [3] we will derive the variance limiting Kalman filter 
(VLKF) in a variational setting. In Section H] we illustrate the VLKF with a simple linear 
toy model for which the filter can be analyzed analytically. We will extract the parameter 
regimes where we expect VLKF to yield opti mal performan ce. In Section [5] we apply the 



VLKF to the 40-dimensional Lorenz-96 system (lLorenzl . 119961 ) and present numerical results 
illustrating the advantage of such a variance limiting filter. We conclude the paper with a 
discussion in section |6j 

2 Setting 

Assume an iV-dimensiona{j] dynamical system whose dynamics is given by 

* = /(*), (1) 

with the state variable z G M. N . We assume that the state space is decomposable according 
to z = (x, y) with xGK" and y G M m and n + m = N. Here x shall denote those variables 
for which direct observations are available, and y shall denote those variables for which only 
some integrated or statistical information is available. We will coin the former observables 
and the latter pseudo-observables. We do not incor porate model err or here and assume that 



rrx 

([Q) describes the truth. We apply the notation of llde et all (1l997l ) unless stated explicitly 
otherwise. 

Let us introduce an observation operator H : — y ~R n which maps from the whole space 
into observation space spanned by the designated variables x. We assume that observations 
of the designated variables x are given at equally spaced discrete observation times U with the 
observation interval At t, s . Since it is assumed that there is no model error, the observations 
y Q G R n at discrete times ti = zAt b s are given by 

y Q (tj) = Hz(ti) + r G , 

with independent and identically distributed (i.i.d.) observational Gaussian noise r Q G M n . 
The observational noise is assumed to be independent of the system state, and to have zero 
mean and constant covariance R Q G IR nxn . 

We further introduce an operator h : M. N — y W m which maps from the whole space into the 
space of the pseudo-observables spanned by y. We assume that the pseudo-observables have 



1 The exposition is restricted to R*, but we note that the formulation can be generalized for Hilbert 
spaces. 
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variance A c i im G K mxm anc l constant mean a c i im G M m . This is the only information available 
for the pseudo-observables, and may be estimated, for example, from climatic measurements. 
The error covariance of those pseudo-observations is denoted by R w G R mxm . 

The model forecast state Zf at each observation interval is obtained by integrating the 
state variable with the full nonlinear dynamics ([I]) for the time interval Ai b s - The back- 
ground (or forecast) involves an error with covariance P/ G R NxN . 

Data assimilation aims to find the best estimation of the current state given the forecast 
Zf with variance Pf and observations y D of the designated variables with error covariance 
R D . Pseudo-observations can be included following the standard Bayesian approach once 
their mean a c n m and error covariance R w are known. However, the error covariance R w 
of a pseudo-observation is in general not equal to A c i im . In Section [3j we will show how 
to derive the error covariance R w in order to ensure that the forecast does not exceed the 
prescribed variance A c ii m . We do so in the framework of Kalman filters and shall now briefly 
summ arize the basic ideas to construct such a filter for th e case of an e nsem ble square root 
filter (ITippett et all 120031 ) , the ensemble transform filter (I Wang et all 12004 ) . 



2.1 Ensemble Kalman filter 



In an ensemble Kalman filter (EnKF) (jEvensenl . 120061 ) an ensemble with k members z& 

Z = [ Zl ,z 2 ,...,z k ]eR Nxk 
is propagated by the full nonlinear dynamics ([IT) , which is written as 

Z = f (Z) , f (Z) = [/( Zl ), /(z 2 ), . . . , /(**)] G R Nxk . 
The ensemble is split into its mean 



(2) 



1 

z = — } Zj = Zw with w = — e G 



1=1 



1 

— c 

k 



where e = [1, . . . , 1] T G M fc , and its ensemble deviation matrix 

Z' = Z - ze T = ZT , 
with the constant projection matrix 

T = I - we T G R kxk . 

The ensemble deviation matrix Z' can be used to approximate the ensemble forecast covari- 
ance matrix via 



1 



k-1 



z\t) [z\t)Y g 



I U\1 T <- wnNxN 



Given the forecast ensemble Z/ = Z(tj — e) and the associ ated forecast error covariance 



matrix (or the prior) P/(tj — e), the actual Kalman analysis (jKalnayl . l2002t lEvensenl . 12006 
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Simonl . 120061 ) updates a forecast into a so-called analysis (or the posterior). Variables at times 



t = ti — e are evaluated before taking the observations (and/or pseudo-observations) into 
account in the analysis step, and variables at times t = + e are evaluated after the analysis 
step when the observations (and/or pseudo-observations) have been taken into account. In 
the first step of the analysis the forecast mean, 

z f = Z/w , 

is updated to the analysis mean 

z a = z f - K [Hz f - y Q ] - K w [hz f 
where the Kalman gain matrices are defined as 



a clim 



(3) 



W 



(4) 



The analysis cov ariance P rt is given by the addition rule for variances, typical in linear 
Kalman filtering (iKalnayl . 120021 ) , 



P a = (P7 1 + H^H + h T R; x h) 1 



(5) 



To calculate an ensemble Z a which is consistent with the error covariance after the observa- 
tion P a , and which therefore needs to satisfy 



1 



k-1 



Z a T [ZJ 



we use the method of en semble square root filters (Simonl. 120061) . In particular we use 
the method proposed in (ITippett et all 120031 ; IWang et all 120041 ). the so called ensemble 
transform Kalman filter (ETKF), which seeks a transformation S G R fcxfc such that 



Z' 



Z^S 



(6) 



Alternatively one could have chosen the ensemble adjustment filter (jAndersonl . 120011 ) in which 
the ensemble deviation matrix Z'j is pre-multiplied with an appropriately determined matrix 
A e M. NxN . However, since we are mainly interested in the case k N we shall use the 
ETKF. Note that the matrix S is not uniquely determined for k < N. The transformation 
matrix S ca n be obtained eithe r by using continuous Kalman filters ( iBergemann et all 120091 ) 
or directly (IWang et all 120041 ) by 

s = c(i fe + r)^c T . 

Here CTC T is the singular value decomposition of 

1 



U = ^yT T Zj (H^H + h T R^h) Z f T 
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The matrix C G M fcx ( fe_1 ) is obtained by erasing the last zero column from C G W. kxk , and 
f G R^ -1 )*^ -1 ) is the upper left (k — 1) x (k — 1) block of the diagonal matrix T G W kxk . 
The deletion of the eigenvalue and the associated columns in C assure that Z' a = Z' a S and 
therefore that the analysis mean is given by z a . Note that S is symmetric and ST = TS 
which assures that Z' a = Z' a S implying that the mean is preserved under the transformation. 
This is not necessarily true for general ensemble transform methods of the form ([6]). 

A new forecast Z(t i+1 — e) is then obtained by propagating Z a with the full nonlinear 
dynamics (TS]) to the next time of observation. The numerical results presented later in 
Sections 0] and [5] are obtained with this method. 

In the next Section we will determine how the error covariance R w used in the Kalman 
filter is linked to the variance A C H m of the pseudo-variables. 



3 Derivation of the variance limiting Kalman filter 

One may naively believe that the error covariance of the pseudo-observable R w is determined 
by the target variance of the pseudo-observables A c i im simply by setting R w = A c i im . In the 
following we will see that this is not true, and that the expression for R w which ensures that 
the variance of the pseudo-observables in the analysis is limited from above by A c ii m involves 
all error covariances. 



We formulate the Kalman filter as a minimization problem of a cost function (e.g. iKalnay 



( 120021 )). The cost function for one analysis step as described in Section [2.11 with a given 
background z/ and associated error covariance P/ is typically written as 



J{*) = ^(z-z^Pj^z-z^ + ^yo-HzfR-^yo-Hz) 
+ ^ ( a dim - hz) T R^ 1 (a clim - hz) , 



(7) 



where z is the state variable at one observation time £j = iAt ^, s . Note that the part 



(Sasaki. 


1970; 


Zupanski. 


1997: 


Neef et al. 


2006) 



The analysis step of the data assimilation procedure consists of finding the critical point 
of this cost function. The thereby obtained analysis z = z a and the associated variance P a 
are then subsequently propagated to the next observation time t i+ i to yield Zf and P/ at 
the next time step, at which a new analysis step can be performed. The equation for the 
critical point with V z J(z) = is readily evaluated to be 



(PJ 1 + H^H + h^h) z a = Pfz f + I^R-Vo + h^aelim 



(8) 



and yields ([3]) for the analysis mean z a , and (jSJ) for the analysis covariance P a with Kalman 
gain matrices given by (j3J). 

To control the variance of the unresolved pseudo-observables a c i im = hz we set 



hP n h T = A 



clim 



(9) 
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Introducing 

V 1 = P] 1 + H T R- l H, (10) 
and up on applying the Sherman- Morrison- Woodbury formula (see for example Golub and Loan! 



(119961 )) to ("P 1 + h T R w 1 h) 1 , equation ([9]) yields the desired equation for R v 

^^-(hn 3 )- 1 , (ii) 

which is yet again a reciprocal addition formula for variances. Note that the naive expec- 
tation that R w = A c i im is true only for P/ — > oo, but is not generally true. For sufficiently 
small background error covariance P/, the error covariance R w as defined in ( ITT]) is not 
positive semi-definite. In this case the information given by the pseudo-observables has to 
be discarded. In the language of variational data assimilation the criterion of positive defi- 
niteness of R" 1 determines whether the weak constraint is switched on or off. To determine 
those eigendirections for which the statistical information available can be incorporated, we 
diagonalize R^ 1 = VDV T and define D with Djj = Da for Da > and D# = for Da < 0. 
The modified R" 1 = VDV T then uses information of the pseudo-observables only in those 
directions which potentially allow for improvement of the analysis. Noting that "P denotes 
the analysis covariance of an ETKF (with R w = 0), we see that equation (fTT|) states that 
the variance constraint switches on for those eigendirections whose corresponding singular 
eigenvalues of hPh 7 " are larger than those of A c i im . Hence the proposed VLKF as defined 
here incorporates the climatic information of the unresolved variables in order to restrict the 
posterior error covariance of those pseudo-observables to lie below their climatic variance 
and to drive the mean towards their climatological mean. 



4 Analytical linear toy model 

In this Section we study the VLKF for the following coupled linear skew product system for 
two oscillators x 6 I 2 , y 6 R 2 

dx = Ax dt — r^x dt + er x dW t + Ay dt 
dy = By dt — T y y dt + a y dB t , 

where A, B and A are all skew-symmetric, cr x ,y an d F x „ are all symmetric, and d\V t and 
dB t are independent two-dimensional Brownian processes^. We assume here for simplicity 
that 

Tec 'fx^- j Ty ) &x 0*1 1 j Gy ^J/I j Ro -^obsl ; 

with the identity matrix I, and 

A = u x J , B = UyJ , A = A J , 

2 We will use bold font for matrices and vectors, and non-bold font for scalars here. It should be clear 
from the context whether bold fonts refer to a matrix or a vector. 
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with the skew-symmetric matrix 

J 



-1 

1 



Note that our particular choice for the matrices implies R w = -R^I. 

The system models two noisy coupled oscillators, x and y. We assume that we have access 
to observations of the variable x at discrete observation times £j = iAt^, but have only 
statistical information about the variable y. We assume knowledge of the climatic mean fj, c u m 
and the climati c covariance <x3 jm of the unobserved variable y. The noise is of Ornstein- 



Uhlenbeck type (jGardinerl . 120031 ) . and may represent either model error or parametrize highly 
chaotic nonlinear dynamics. Without loss of generality, the coupling is chosen such that the 
y-dynamics drives the x-dynamics but not vice versa. The form of the coupling is not 
essential for our argument, and it may be oscillatory or damping with A = AI. We write 
this system in the more compact form for z = (x, y) G M 4 

dz = Mzdt-Tzdt + crdWt + Czdt (12) 

with 



M 



A 








B 


cr x 








<Jy 



c 



T x 

Ty 

A 




The solution of ( 1T21) can be obtained using Ito's formula and, introducing the propagator 
L(t) = exp {{M — r + C)t), which commutes with cr for our choice of the matrices, is given 
by 



z(t) = L(t)z + cr [ L(t-s) dW s 
Jo 



with mean 

H(t) = L(t)z , 

and covariance 

E(t) = <r (2r - C)' 1 (I - exp (- (2T - C) t)) <t t , (13) 

where 



C 



A 

-A 
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The climatic mean fj, c n m G M 4 and covariance matrix S C H m G M 4x4 are then obtained in the 
limit t — » oo as 

H chm = lim fj,(t) = , 

t— >oo 

and 

S clim = lim £(t) = cr (2r - C) _1 <x T . 

t— s-oo 

In order for the stochastic process ( fl2l) to have a stationary density and for £(£) to be a 
positive definite covariance matrix for all t, the coupling has to be sufficiently small with 
A 2 < 47^.7^. Note that the skew product nature of the system (fT2"l) is not special in the sense 
that a non-skew product structure where x couples back to y would simply lead to a renor- 
malization of C. However, it is pertinent to mention that although in the actual dynamics 
of the model ( II 2p there is no back-coupling from x to y, the Kalman filter generically intro- 
duces back-coupling of all variables through the inversion of the covariance matrices (cf. (jSJ)). 

We will now investigate the variance limiting Kalman filter for this toy model. In particular 
we will first analyze under what conditions R w is positive definite and the variance constraint 
will be switched on, and second we will analyze when the VLKF yields a skill improvement 
when compared to the standard ETKF. 

We start with the positive definiteness of R w . When calculating the covariance of the 
forecast in an ensemble filter we need to interpret the solution of the linear toy model (fl2|) 
as 

d f At obs 

Zj(t i+1 ) = L(At obB )zj(ti) + a L(At ohs - s)dW js j = 1, 2, • ■ ■ , k , 

Jo 

where Zj(t i+ i) is the forecast of ensemble member j at time t i+ i = ti + At obs = (i + l)At obs 
before the analysis propagated from its initial condition Zj(ti) = z a (ti) + £j with £j ~ 
jV(0, P a (£j)) at the previous analysis. The equality here is in distribution only, i.e. members 
of the ensemble are not equal in a pathwise sense as their driving Brownian will be different, 
but they will have the same mean and variance. The covariance of the forecast can then be 
obtained by averaging with respect to the ensemble and with respect to realizations of the 
Brownian motion, and is readily computed as 

P f (t i+1 ) = L(At ohs ) P a (U) L T (At ohs ) + £(At obs ) , (14) 

where L T {t) = exp ((— M — T + C T ) t) denotes the transpose of Lit). The forecast covari- 
ance of an ensemble with spread P a is typically larger than the forecast covariance X! of one 
trajectory with a non-random initial condition z . The difference is most pronounced for 
small observation intervals when the covariance of the ensemble P f will be close to the initial 
analysis covariance P a , whereas a single trajectory will not have acquired much variance S. 
In the long-time limit, both, Pf and S, will approach the climatic covariance S c n m (cf. ( TT3]) ). 
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In the following we restrict ourselves to the limit of small observation intervals At ^ s <C 1. 
In this limit, we can approximate P a (^) ~ P/(^+i) and explicitly solve the forecast co- 
variance matrix Pf using (1141) . This assumption requires that the analysis is stationary in 
the sense that the filter has lost its memory of its initial background covariance provided 
by the user to start up the analysis. We have verified the validity of this assumption for 
small observation intervals and for a range of initial background variances. This assump- 
tion renders f[T4"|) a matrix equation for Pf. To derive analytical expressions we further 
Taylor-expand the propagator L(At ^ s ) and the covariance X(At b s ) for small observation 
intervals At obs . This is consistent with our stationarity assumption P a (ti) ~ P/(t i+ i). The 
very lengthy analytical expressio n for P/(tj + i) can be obtained with the aid of Mathematica 



(IMathematica Version 7.0 



20081 ). but is omi tted from this paper. 



In filtering one often uses variance inflation (lAnderson and Anderson! Il999l ) to compensate 



for the loss of ensemble variance due to finite size effects, sampling errors and the effects 
of nonlinearities. We do so here by introducing an inflation factor 5 > 1 multiplying the 
forecast variance P/. Having determined the forecast covariance matrix Pf we are now 
able to write down an expression for the error covariance of the pseudo-observables R w . As 
before we limit the variance and the mean of our pseudo-observable y to be A clim = cr 2 lim 
and a clim = /-i c n m . Then, upon using the definitions ffTUj) and ffTTl) . we find that the error co- 
variance for the pseudo-observables R w is positive definite provided the observation interval 
Ai bs is sufficiently larg^E Particularly, in the limit of R D — > oo, we find that if 



At ° taW > 27,(1 • (15) 



the variance constraint will be switched on. Note that for 5 > 1 the critical At b s above 
which R w is positive definite can be negative, implying that the variance constraint will 

be switched on for all (positive) values of At b s - If no inflation is applied, i.e. 5 = 1, this 
simplifies to 

A 2 

Ai bs > ; 7^r > . (16) 

2 7 ,(l + 7y 2 ) 1 ' 

Because ^jxly — A 2 > the critical observation interval At Q b s is smaller for non-trivial in- 
flation with 5 > 1 than if no variance inflation is incorporated. This is intuitive, because 
the variance inflation will increase instances with |hP a h T | > |cr 2 lim |. We have numerically 
verified that inflation is beneficial for the variance constraint to be switched on. It is perti- 
nent to mention that for sufficiently large coupling strength A or sufficiently small values of 
7a:, Equation ffTB"]) may not be consistent with the assumption of small observation intervals 
At obs < 1. 

We have checked analytically that the derivative of R" 1 is positive at the critical observa- 
tion interval At b s , indicating that the frequency of occurrence when the variance constraint 

3 We actually compute R^ 1 , however, since R w is diagonal for our choice of the matrices, positive defi- 
niteness of R~ 1 implies positive definiteness of R w . 
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is switched on increases monotonically with the observation interval At b s , in the limit of 
small At bs- This has been verified numerically with the application of VLKF for (j!2p and 
is illustrated in Figure [TJ 

At this stage it is important to mention effects due to finite size ensembles. For large obser- 
vation intervals At b s —> oo and large observational noise R Q — > oo, we have P/ — > S c i im and 
our analytical formulae would indicate that the variance constraint should not be switched 
on (cf. ffTUj) and ffTTT) ). However, in numerical simulations of the Kalman filter we observe 
that for large observation intervals the variance constraint is switched on for almost all anal- 
ysis times. This is a finite ensemble size effect and is due to the mean of the forecast variance 
ensemble adopting values larger than the climatic value of cr clim implying positive definite 
values of R w . The closer the ensemble mean approaches the climatic variance, the more 
likely fluctuations will push the forecast covariance above the climatic value. However, we 
observe that the actual eigenvalues of R w decrease for At obs — > oo and for the size of the 
ensemble k — > oo. 



The analytical results obtained above are for the ideal case with k — > oo. As mentioned 
in the introduction, in sparse observation networks finite ensemble sizes cause the overes- 
timation of error covariances ( ILiu et all 120081 ; IWhitaker et all 120091 ). implying that R w is 
positive definite and the variance limiting constraint will be switched on. This finite size 
effect is illustrated in Figure |2j where the maximal singular value of hP a h T , averaged over 
50 realizations, is shown for ETKF as a function of ensemble size k for different observa- 
tional noise variances. Here we used no inflation, i.e. 5 = 1, in order to focus on the effect 
of finite ensemble sizes. It is clearly seen that the projected covariance decreases for large 
enough ensemble sizes. The variance will asymptote from above to hS c i im h T in the limit 
k oo. For sufficiently small observational noise, the filter corrects too large forecast error 
covariances by incorporating the observations into the analysis leading to a decrease in the 
analysis error covariance. 



However, the fact that the variance constraint is switched on does not necessarily imply that 
the variance limiting filter will perform better than the standard ETKF. In particular, for 
very large observation intervals At Q b s when the ensemble will have acquired the climatic mean 
and covariances, VLKF and ETKF will have equal skill. We now turn to the question under 
what conditions VLKF is expected to yield improved skill compared to standard ETKF. To 
this end we introduce as skill indicator the (squared) RMS error 

£ = E t > dW \\z a {t l )-z tvuth (t i )\\ 2 G , (17) 

between the truth z truth and the ensemble mean analysis z a (the square root is left out 
here for convenience of exposition). Here E* denotes the temporal average over analyzes 
cycles, and K dW denotes averaging over different realizations of the Brownian paths dW . 
We introduced the norm 1 1 ab 1 1 g = a T Gb to investigate the overall skill using G = I, the 
skill of the observed variables using G = H T H and the skill of the pseudo-observables using 
G = h T h. Using the Kalman filter equation fl3]) for the analysis mean with K w = 0, we 
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Figure 1: Proportion of incidences when the variance constraint is switched on and R w is 
positive definite as a function of the observation interval At Q ^ s for the stochastic linear toy 
model ( TT2"j) . We used ^ x — 1, j y — 1, a x = 1, a y = 1, X = 0.2. We used k = 20 ensemble 
members, 100 realizations and R Q = HS c ii m H T and no inflation with 5 = 1. The analytically 
calculated critical observation interval according to equation ffT6l) is At Q b s = 10~ 2 . 
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Figure 2: Average maximal singular value of hP a h T as a function of ensemble size k for the 
stochastic linear toy model (fl2l) using standard ETKF without inflation, with R Q = 0.25 
(dashed curve) and R Q = 2 (solid curve). Parameters are a x = a y = j x = j y = 1, A = 0.2, 
At bs = 1, for which the climatic variance is h£h T m 0.505. We used 50 realizations for the 
averaging. 
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obtain for the ETKF 

£ E tkf = E ^| )(I _ Ko H)(z / (t t ) - z truth (^)) + K r (^)||^ . 

Solving the linear toy-model f|T2|) for each member of the ensemble and then performing an 
ensemble average, we obtain 

z / (t i ) = i(At obB )z a (t i _ 1 ). (18) 

Substituting a particular realization of the truth z trut h(t), and performing the average over 
the realizations, we finally arrive at 

£Etkf = E tp _ Ko H)L(At obs ) £ ti _J G + E*||(I - K H)r7 ti ||^ + E'||K r || G , (19) 
with the mutually independent normally distributed random variables 

£ ti = Z a (U) - Ztruth(*i) ~ A/"(0, P a (U)) 

Vu = * f H&t ohs - s) dW s ~ JV(0, S(At obs )) 
Jti-i 

r G ~ Af(0, Ro) . (20) 

We have numerically verified the validity of our assumptions of the statistics of £ tj and rj ti ■ 
Note that for £ f . to have mean zero and variance P a (£j) filter divergence has to be excluded. 
Similarly we obtain for the VLKF 

£ vlkf = E *|| (I _ Ko H)L(AU s ) + - K H)r7 ti || G + E'||K r || G 

+ E*||K w h0J| G + 2E'[(I-K o H) J L(At obs )^_ 1 ) T G(K w h0J] , (21) 
with the normally distributed random variable 

C* = 2/(ti)~^'(0,ip/(t i )) l (22) 

where we used that a c i im = 0. Note that using our stationarity assumption to calculate 
Pf we have ~ Again we have numerically verified the statistics for £ ti . The 

expression for the RMS error of the VLKF (12T|) can be considerably simplified. Since for 
large ensemble sizes k — > oo the random variable ( t . becomes a deterministic variable with 
mean zero, we may neglect all terms containing ( ti . We summarize to 

£ ylkf = E t|| (I _ Ko H)X(At obs ) ^ || G + E'||(I - K H)^|| G + E'||K r || G . (23) 

For convenience we have omitted superscripts for K Q and in (fT9l and fl23|) to denote 
whether they have been evaluated for ETKF and VLKF. But note that, although the ex- 
pressions ( lT9j) and ( 1231) are formally the same, one generally has £ ETKF ^ S VLKF , because 
the analysis covariance matrices P a are calculated differently for both methods leading to 
different gain matrices K Q and different statistics of $, t in (1T9~1) and (1231) . 
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We can now estimate the skill improvement defined as 

g _ £ETKF /gVLKF 

with values of S > 1 indicating skill improvement of VLKF over ETKF. We shall choose 
G = h T h from now on, and concentrate on the skill improvement for the pseudo-observables. 
Recalling that £ ETKF & £ VLKF for large observation intervals At t, s , we expect skill improve- 
ment for small At b s . We perform again a Taylor expansion in small At a b s of the skill 
improvement S. The resulting analytical expressions are very lengthy and cumbersome, and 
are therefore omitted for convenience. 

We found that there is indeed skill improvement S > 1 in the limit of either ^ y — > oo or 
j x — > 0. This suggests that the skill is controlled by the ratio of the time scales of the observed 
and the unobserved variables. If the time scale of the pseudo-observables is much larger than 
the one of the observed variables, VLKF will exhibit superior performance over ETKF. This 
can be intuitively understood since l/(2j y ) is the time scale on which equilibrium - i.e. 
the climatic state - is reached for the pseudo-observables y. If the pseudo-observables have 
relaxed towards equilibrium within the observation interval At b s , and their variance has 
acquired the climatic covariance hP a h T = cr^ lim , we expect the variance limiting to be ben- 
eficial. 



Furthermore, we found analytically that the skill improvement increases with increas- 
ing observational noise i? Q b s (at least in the small observation interval approximation). In 
particular we found that dS/dR Q i, s > at -R D b s = 0. The increase of skill with increasing 
observational noise can be understood phenomenologically in the following way. For i? Q b s = 
the filter trusts the observations, which as a time series carry the climatic covariance. This 
implies that there is a realization of the Wiener process such that the analysis can be re- 
produced by a model with the true values of j Xjy and a XjV . Similarly, this is the case in 
the other extreme _R b s —> oo, where the filter trusts the model. For <C i? b s «C oo the 
analysis reproducing system would have a larger covariance a x than the true value. This 
slowed down relaxation towards equilibrium of the observed variables can be interpreted as 
an effective decrease of the damping coefficient j x . This effectively increases the time scale 
separation between the observed and the unobserved variables, which was conjectured above 
to be beneficial for skill improvement. 

As expected, the skill improves with increasing inflation factor 5 > 1. The improvement 
is exactly linear for At b s — > 0. This is due to the variance inflation leading to an increase 
of instances with hP a h T > cr^ lim , for which the variance constraint will be switched on. 

In Figure [3] we present a comparison of the analytical results (fl9j) and fl23l) with results 
from a numerical implementation of ETKF and VLKF for varying damping coefficient j y . 
Since -y y controls the time-scale of the y-process, we cannot use the same At b s for a wide 
range of 7^ in order not to violate the small observation interval approximations used in our 
analytical expressions. We choose At obs as a function of 7^ such that the singular values 
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of the first-order approximation of the forecast variance is a good approximation for this 
At obs . For Figure [3] we have At obs G (0.005,0.01) to preserve the validity of the Taylor 
expansion. Besides the increase of the skill with j y , Figure El shows that the value of S 
increases significantly for larger values of the inflation factor 5 > 1. 

We will see in the next Section that the results we obtained for the simple linear toy 
model ffl~2]) hold as well for a more complicated higher- dimensional model, where the dynamic 
Brownian driving noise is replaced by nonlinear chaotic dynamics. 





Figure 3: Dependency of the skill improvement S of VLKF over ETKF on the damping 
coefficient 7 y of the pseudo-observable. We show a comparison of direct numerical simulations 
(open circles) with analytical results using (|2~TT) (continuous curve) and the approximation 
of large ensemble size (123]) (dashed curve). Parameters are ^ x — 1, A = 2, a x — a y — 1, 
-Robs — 0.25. We used an ensemble size of k = 20 and averaged over 1000 realizations. Left: 
no inflation with 5 = 1. Right: Inflation with 5 = 1.02 2 . 



5 Numerical results for the Lorenz-96 system 



We illustrate our method with the Lorenz-96 system ( Lorenzl . 19961 ) and show its us efulness 
for sp arse observations in improving the analysis skill and stabilizing the filter. In ( ILorenzl . 
19961 ) Lorenz proposed the following model for the atmosphere 



Zi-l[Z; 



i+1 



Zi-2 



Zi + F 



D 



(24) 



with z = (zi,--- ,z D ) and periodic z i+D = Z{. This system is a toy-model for midlati- 
tude atmospheric dynamics, incorporating linear damping, forcing and nonlinear transport. 
The dynamical properties o f the Lorenz-96 system have been investigated, for examp le, in 
( iLorenz and Emanuell . Il998l : lOrrell and Smith! l2003t iGottwald and Melbourne!. 120051 ) , an d 



in the context of data assimilation it was i nvestigated in, for example, (lOtt et all I2004J : 
Fisher et all 120051 ; lHarlim and Majdal . |2010| ). We use D = 40 mo des and set th e forcing to 
F = 8. These parameters correspond to a strongly chaotic regime (ILorenzl . Il996l ). For these 
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parameters one unit of time corresponds to 5 days in the earth's atmosphere as calculated 
by calibrating the e-folding ti me of the asym ptotic growth rate of the most unstable mode 



with a time scale of 2.1 days (ILorenzl . I1996T ). Assuming the length of a midlatitude belt 



to be about 30, 000km, the spatial scale corresponding to a discretization of the circumfer- 
ence of the earth along the midlatitudes in D = 40 grid points corresponds to a spacing 
between adjacent grid points Zi of approximately 750km, roughly equalling the Rossby ra- 
dius of deformation at midlatitudes. We estimated from simulations the advection velocity 
to be approximately 10.4 m/sec which compares well with typical wind velocities in the 
midlatitudes. 

In the following we will investigate the effect of using VLKF on improving the analysis skill 
when compared to a standard ensemble transform Kalman filter, and on stabilizing the filter 



and avoiding blow-up as discussed in ( lOtt et all \2004 . iKepertl . l2004t lHarlim and Majda 



2010l ). We perform twin experiments using a k = 41-member ETKF and VLKF with the 
same truth time series, the same set of observations and the same initial ensemble. We have 
chosen an ensemble with k > D in order to eliminate the effect that a finit e-size ensembl e 
can only fit as many observations as the number of its ensemble members ( ILorend . 120031 ). 
Here we want to focus on the effect of limiting the variance. 



Th e system is integrated using the implicit mid-point rule (see for example iLeimkuhler and Reich 
( 120051 )) to a time T = 30 with a time step dt = 1/240. The total time of integration cor- 
responds to an equivalent of 150 days, and the integration timestep dt corresponds to half 
an hour. We measured the approximate climatic mean and variance, //dim and of lim , re- 
spectively, via a long time integration over a time interval of T = 2000 which corresponds 
roughly to 27.5 years. Because of the symmetry of the system ( 1241 . the mean and the stan- 
dard deviation are the same for all variables Z{, and are measured to be cr clim = 3.63 and 
/^ciim = 2.34. 

The initial ensemble at t = is drawn from an ensemble with variance of lim ; the filter was 
then subsequently spun up for sufficiently many analysis cycles to ensure statistical station- 
arity. We assume Gaussian observational noise of the order of 25% of the climatological stan- 
dard deviation <J c u m , and set the observational error covariance matrix R Q = (0.25a c u m ) 2 I. 
We find that for larger observational noise levels the variance limiting correction (iTTj) is used 
more frequently. This is in accordance with our finding in the previous section for the toy 
model. 

We study first the performance of the filter and its dependence on the time between 
observations At obs and the proportion of the system observed l/iV obs . N ohs = 2 means only 
every second variable is observed, iV bs = 4 only every fourth, and so on. 

We have used a constant variance inflation factor 5 = 1.05 for both filters. We note that 
the optimal inflation factor at which the RMS error S is minimal, is different for VLKF and 
ETKF. For At f, s = 5/120 (5 hours) and N Q b s = 4 we find that 5 = 1.06 produces minimal 
RMS errors for VLKF and 5 = 1.04 produces minimal RMS errors for ETKF. For 5 < 1.04 
filter divergence occurs in ETKF, so we chose 5 = 1.05 as a compromise between controlling 
filter divergence and minimizing the RMS errors of the analysis. 

Figure H] shows a sample analysis using ETKF with N D b s = 5, At G b s = 0.15 and R G = 
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(0.25<j c / im ) 2 1 for an arbitrary unobserved component (top panel) and an arbitrary observed 
component (bottom panel) of the Lorenz-96 model. While the figure shows that the analysis 
(continuous grey line) tracks the truth (dashed line) reasonably well for the observed compo- 
nent, the analysis is quite poor for the unobserved component. Substantial improvements are 
seen for the VLKF when we incorporate information about the variance of the un-observed 
pseudo-observables, as can be seen in Figure [5j We set the mean and the variance of the 
pseudo-observables to be the climatic mean and variance, a clim = // r 1iTT 1 e and A c i im = <7 2 irri I 
to filter the same truth with the same observations as used to produce Fig. |H For these 
parameters (and in this realization) the quality of the analysis in both the observed and 
unobserved components is improved. 




Figure 4: Sample ETKF analysis (continuous grey line) for observed z 5 (bottom panel) and 
unobserved Z\ (top panel) component. The dashed line is the truth, the crosses are observa- 
tions. Parameters used were N obs = 5, At obs = 0.15 (18 hours) and R Q = (0.25cr c / im ) 2 I. 

As for the linear toy model (fl2|) . finite ensemble sizes exacerbate the overestimation of 
error covariances. In Figure the maximal singular value of hP a h T , averaged over 150 re- 
alizations, is shown for ETKF as a function of ensemble size k. Again we use no inflation, 
i.e. 5 = 1, in order to focus on the effect of finite ensemble sizes. The projected covariance 
clearly decreases for large enough ensemble sizes. However, here the limit of the maximal 
singular value of hP a h T for k — > oo underestimates the climatic variance 0" 2 lim = 13.18. 
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Figure 5: Sample VLKF analysis (continuous grey line) for observed (bottom panel) 
and unobserved Z\ (top panel) component. The dashed line is the truth, the crosses are 
observations. Parameters as in Figure HI 



To quantify the improvement of the VLKF filter we measure the site-averaged RMS error 



(j-p l|za( /At °bs) - z truth (/At obs )|| 2 ) (25) 
° 1=1 

between the truth z tru th and the ensemble mean z a with L = IT/At^l where the average 
is taken over 500 different realizations, and D Q < D denotes the length of the vectors z a . 
In tables [T] we display £ for the ETKF and VLKF respectively, as a function of N obs and 
At bs- The increased RMS error for larger observation intervals At ^ s can be linked to the 
increased variance of the chaotic nonlinear dynamics generated during longer integration 
times between analyses. Figure [7J shows the average proportional improvement of the VLKF 
over ETKF, obtained from the values of tables HJ Figure [7] shows that the skill improvement 
is greatest when the system is observed frequently. For large observation intervals At b s 
ETKF and VLKF yield very similar RMS. We checked that for large observation intervals 
Ai bs both filters still produce tracking analyses. Note that the observation intervals At Q b s 
considered here are all much smaller than the e-folding time of 2.1 days. The most significant 
improvement occurs when one quarter of the system is observed, that is for N b s = 4, and for 
small observation intervals At b s - The dependency of the skill of VLKF on the observation 
interval is consistent with our analytical findings in Section HI 

We have tested that the increase in skill as depicted in Figure [7J is not sensitive to 
incomplete knowledge of the statistical properties of the pseudo-observables by perturbing 
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A c i im and a c i im and then monitoring the change in RMS error. We performed simulations 
where we drew A c i im and a clim independently from uniform distributions (0.9 A c i im , 1.1 A clim ) 
and (0.9 a clim , 1.1 a c i im ). We found that for parameters N obs — 2,4,6, rj = 0.05,0.25,0.5 
(with rj measuring the amount of the climatic variance used through R D = (r/ a c u m ) 2 1), and 
At bs = 0.025, 0.05, 0.25 (corresponding to 3, 6 and 30 hours) over a number of simulations 
there was on average no more than 7% difference of the analysis mean and the singular values 
of the covariance matrices between the control run where A c i im = cx^ lim I and a clim = /i c ii m e 
is used, and when A c i im and a c i im are simultaneously perturbed. 

An interesting question is how the relative skill improvement is distributed over the 
observed and unobserved variables. This is illustrated in Figure [8] and Figure [9j In Figure [8] 
we show the proportional skill improvement of VLKF over ETKF for the observed variables 
and the pseudo-observables, respectively. Figure M shows that the skill improvement is larger 
for the pseudo-observables than for the observables which is to be expected. In Figure M we 
show the actual RMS error £ for ETKF and VLKF for the observed variables and the pseudo- 
observables. It is shown that the skill improvement is better for the unobserved pseudo- 
observables for all observation intervals At b s - In contrast, VLKF exhibits an improved skill 
for the observed variables either for small observation intervals for all values of iV bs or for 
all (sufficiently small) observation intervals when iV bs = 4, 5. We have, however, checked 
that the analysis is still tracking the truth reasonably well, and the discrepancy with ETKF 
is not due to the analysis not tracking the truth anymore. As expected, the RMS error 
asymptotes for large observation intervals At b s (not shown) to the standard deviation of 
the observational noise 0.25<r c ii m ~ 0.91 for the observables, and to the climatic standard 
deviation cr C H m = 3.63 for the pseudo-observable (not shown), albeit slightly reduced for 
small values of iV b s due to the impact of the surrounding observed variables (see Figure [TO]) . 

Note that there is an order of magnitude difference between the RMS errors for the 
observables and the pseudo-observables for large iV bs (cf. Figures |9j. This suggests that the 
information of the observed variables does not travel too far away from the observational 
sites. However, the nonlinear coupling in the Lorenz-96 system ( 12"4"|) allows for information 
of the observed components to influence the error statistics of the unobserved components. 
Therefore the RMS error of pseudo-observables adjacent to observables are better than those 
far away from observables. Moreover, the specific structure of the nonlinearity introduces 
a translational symmetry-breaking (one may think of the nonlinearity as a finite difference 
approximation of an advection term zz x ), which causes those pseudo-observables to the right 
of an observable to have a more reduced RMS error than those to the left of an observable. 
This is illustrated in Figure ITUl where the RMS error is shown for each site when only one site 
is observed. The advective time scale of the Lorenz-96 system is much smaller than At Q b s 
which explains why the skill is not equally distributed over the sites, and why, especially 
for large values of iV b s , we observe a big difference between the site-averaged skills of the 
observed and unobserved variables. 

In Figure [TTJ we show how the RMS error behaves as a function of the observational noise 
level. We see that for iV bs = 4 VLKF always has a smaller RMS error than ETKF. 
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Table 1: RMS errors for ETKF (upper table) and VLKF (bottom table), averaged over 500 
simulations, and with R D = (0.25cr c /j m ) 2 1 as observational noise. 




Figure 6: Average maximal singular value of hP a h T as a function of ensemble size k for the 
Lorenz-96 model (124"]) . using standard ETKF without inflation. All other parameters are as 
in Figure 4. We used 150 realisations for the averaging. 
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Figure 7: Proportional skill improvement of VLKF over ETKF as a function of the observa- 
tion interval At obs for different values of iV obs , with observational noise R Q = (0.25cr c / im ) 2 I. 
A total of 500 simulations were used to perform the ensemble average in the RMS errors 8 
using ( 125]) for ETKF and VLKF. At obs is measured in hours. 
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Figure 8: Proportional skill improvement of VLKF over ETKF as a function of the observa- 
tion interval At obs for different values of N ohs . The RMS error £ is calculated using only the 
observed variables (left) or only the pseudo-observables (right). At obs is measured in hours. 
Parameters as in Figure [7J 
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Figure 9: RMS error of VLKF (solid lines) and ETKF (dashed lines) for R G = (0.25a cHm ) 2 I, 
where £ is calculated using only the observed variables (left) or only the pseudo-observables 
(right). At obs is measured in hours. Parameters as in Figure [3 




Figure 10: RMS error 8 for each variable Z{ as a function of the lattice site i. Only one 
observable was used at i = 21. Time between observations is At obs = 10 hours and observa- 
tional noise with covariance R Q = (0.25 cr^ Hm )l was used. The results are averaged over 100 
different realizations. 
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The results confirm again the results from our analysis of the toy model in Section HJ 
that VLKF yields best performance for small observation intervals At b s and for large noise 
levels. For large observation intervals ETKF and VLKF perform equally well, since then the 
chaotic model dynamics will have lead the ensemble to have acquired the climatic variance 
during the time of propagation. 



In ( IQtt et all 12004 ) it was observed that if not all variables z% are o bserved the Kalman 



filter dive rges exhibiting b low-up. Similar behaviour was observed in (lHarlim and Majda 



20101 ) . In (jOtt et all 120041 ) the authors suggested that the sparsity of observations leads to an 
inhomogeneous background error, which causes an underestimation of the error covariance. 
We study here this catastrophic blow-up divergence (as opposed to filter divergence when 
the analysis diverges from the truth) and its dependence on the time between observations 
At obs and the proportion of the system observed l/A^ obs . We note that blow-up divergence 
appears only in the case of sufficiently small observational noise and moderate values of 
Atobs- Once At Q b s is large enough (in fact, larger than the e-folding time corresponding to 
the most unstable Lyapunov exponent, in our case 2.1 days) we notice that no catastrophic 
divergence occurs, independent of iV bs. This probably occurs because for large observation 
intervals the ensemble acquires enough variance through the nonlinear propagation. We 
prescribe Gaussian observational noise of the order of 5% of the climatological standard 
deviation a c u m , and set the observational error covariance matrix to R Q = (0.05 a c iim) 2 I- 
The initial ensemble at t = is drawn again from an ensemble with variance of lim . 

To study the performance of VLKF when blow-up occurs in ETKF simulations we count 
the number iV& of blow-ups that occur before a total of 100 simulations have terminated 
without blow-up. The proportions of blow-ups for the respective filters is then given by 
Nb/ (Nb + 100). We tabulate this proportion in tables [2] for the ETKF and VLKF respectively 
and the proportional improvement in table [3j The 'x's' in the table represent cases where 
no successful simulations could be obtained due to blow-up. 

Both filters suffer from severe filter instability for iV bs = 6, i.e. for very sparse obser- 
vational networks, at small observation intervals At obs . No blow-up occurs for either filter 
when every variable is observed. Note the reduction in occurrences of blow-ups for large 
observation intervals At Q b s as discussed above. We have checked that for all iV bs there is 
no blow-up for ETKF (and VLKF) for sufficiently large At b s (not shown); the larger A^bs 
the smaller the upper bound of At obs such that no blow-ups occur. Collapse is most promi- 
nent for ETKF (and for VLKF, but to a much lesser extent) for larger values of iV b S and 
at intermediate observation intervals which depend on iV b s . Tables |2] and |3] clearly show 
that incorporating information about the pseudo-observables strongly increases the stability 
of the filter and suppresses blow-up. However, we note that despite the gain in stability 
VLKF has a skill less than the purely observational skill in the cases when blow-up occurs 
for ETKF, because the solutions become non-tracking. Further research is under way to 
improve on this in the VLKF framework. 

The fact that incorporating information about the variance of the un-observed variables 
improves the stability of the filter is in accordance with the interpretation of filter divergence 
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Figure 11: RMS error £ of VLKF (solid lines) and ETKF (dashed lines), as a function of 
the observational noise, measured here by rj defined via R D = (r]a c u m ) 2 I. The dotted line 
indicates the RMS error if only observations were used. We show results from top to bottom 
for several observation intervals: At Q b s = 1 hour, At Q b s = 2 hours and At a b s = 5 hours, 
-^obs — 4 was used and 1000 simulations were carried out to perform the ensemble averages 
in the RMS errors £ using §2§§ for ETKF and VLKF. 
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Table 2: Proportion of catastrophically diverging simulations with ETKF (upper table) and 
VLKF (lower table). Observational noise with R Q = (0.05cr c ; im ) 2 1 was used. 
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Table 3: Proportional improvement of VLKF and ETKF as calculated as the ratio of the 
values from tables El 
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of sparse observational networks provided in (jOtt et al. 



20041 ) . 



6 Discussion 



We have developed a framework to include information about the variance of unobserved 
variables in a sparse observational network. The filter is designed to control overestimation 
of error covariances typical in sparse observation networks, and limits the posterior analysis 
covariance of the unresolved variables to stay below their climatic variance. We have done so 
in a variational setting and found a relationship between the error covariance of the variance 
constraint R w and the assumed target variance of the unobserved pseudo-observables A c i im . 

We illustrated the beneficial effects of the variance limiting filter in improving the analy- 
sis skill when compared to the standard ensemble square root Kalman filter. We expect the 
variance limiting constraint to improve data assimilation for ensemble Kalman filters when 
finite size effects of too small ensemble sizes overestimate the error covariances, in particular 
in sparse observational networks. In particular we found that the skill will improve for small 
observation intervals At Q b s and sufficiently large observational noise. We found substantial 
skill improvement for both observed and unobserved variables. These effects can be under- 
stood with a simple linear toy model which allows for an analytical treatment. We further 
established numerically that VLKF reduces the probability of catastrophic filter divergence 
and improves the stability of the filter when compared to the standard ensemble square root 
Kalman filter. 



We remark that the idea of the variance limiting Kalman filter is not restricted to en- 
semble Kalman filters but can also be used to modify the extended Kalman filter. However, 
for the examples we used here the nonlinearities were too strong and the extended Kalman 
filter did not yield satisfactory results, even in the variance limiting formulation. 



The effect of the variance limiting filter to control unrealistically large error covariances 
of the poorly resolved variables due to finite ensemble sizes may find useful applications. We 
mention here that the variance constraint is able to adaptively damp unrealistic excitation 
of ensemble spread in underesolved spatial regions due to inappropriate uniform inflation. 
This may be an alternative to th e spatially adaptive schemes which were recently developed 
( lAndersonl . 120071 ; iLi et a/.[ 120091 ). In addition, it is known that loc alization of covariance ma- 



trices i n EnKF l eads t o imbalance in the analyzed fields (see, e.g.. iHoutekamer and Mitchell 



( 120051 ); iKepertl (120091 ) for a recent study). Filter localization typically excites unwanted 
gravity waves which when uncontrolled can substantially degrade filter performance. One 
may construct balance constraints as pseudo-observations and thereby potentially reduce 
this undesired aspect of covariance localization. As more specific applications, we mention 
climate reanalysis and data assimilation for the mesosphere. It would be interesting to see 
how the proposed variance limiting filter can be used in climate reanalysis schemes to deal 
with the vertical sparcity of observational data and the less dense observation network on 
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the southern hemisphere in the pre-radiosonde era (see IWhitaker et al. One would 



need to establish though whether the historical observation intervals At Q b s are sufficiently 
small to allow for a skill improvement. Similarly, it may help to control the dynamically 
dominant gravity wave activity in the me s osphe re as the upper lid is pushed further and 
further (see for example Polavarapu et al. ( 20051 )). However, a word of caution is required 



here. In some atmospheric data assimilation problems, it is not at all uncommon to have 
an ensemble prior variance for certain variables that is significantly larger than the clima- 
tological variance, when the atmosphere is locally far away from equilibrium. One relevant 
example would be in the vicinity of strong fronts over the southern ocean. In such a case, it 
may not be appropriate to limit the variance to the climatological value. 

In this work we have studied systems where for sufficiently large observation intervals 
At bs the variables acquire their true climatological mean and variance when the model is 
run. In particular we have not included model error. It would be interesting to see whether 
the variance limiting filter can help to control model error in the case that the free running 
model would produce unrealistically large forecast covariances. Usually numerical schemes 
do underestim ate error covariances, but this is often caused by severe divergence damping 



(jDurranl . Il999l ) which is artificially introduced to the model to control unwanted gravity wave 
activity and to stabilize the numerical scheme. The stabiliziation may be achieved by a much 
smaller amount of divergence damping by implementing the variance limiting constraint in 
the data assimilation procedure. The VLKF would in this case act as an effective adaptive 
damping scheme, counteracting the model error. 
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